Add support for local models via Ollama #6

r-carroll · 2025-01-30T03:14:17Z

Add new system variable to drive branching logic between openai and ollama
update docs
update setup test
add llama library to requirements file

Testing Verification:
speed benchmarks using an M1 macbook pro

running with openai
running with ollama
running with ollama occasionally fails cache test, not sure if this should be investigated further or adjusted when running with llama

r-carroll · 2025-01-30T03:15:24Z

Ollama.md

@@ -0,0 +1,29 @@
+# Ollama setup


Keeping most ollama-specific instructions separate to be less intrusive

r-carroll · 2025-01-30T03:15:46Z

Ollama.md

+1. Download and install [Ollama](https://ollama.com/)
+2. Once Ollama is running on your system, run `ollama pull llama3.1`
+> Currently this is a ~5GB download, it's best to download it before the workshop if you plan on using it
+3. `ollama pull nomic-embed-text`


nomic is pretty small so no need to call this one out

r-carroll · 2025-01-30T03:16:22Z

Ollama.md

+When you get to the system prompt section of the workshop, llama requires that you are a bit more explicit with your instructions. If the prompt given in the main instructions doesn't work, try the following instead:
+
+```
+system_prompt = """


tried several iterations of system prompts. llama 3.1 seems to need very explicit instructions

r-carroll · 2025-01-30T03:16:57Z

Ollama.md

+
+You're now ready to begin the workshop! Head back to the [Readme.md](Readme.md)
+
+## Restarting the workshop 


This may bear further investigation, but in my tests it was best to kill and re-create

r-carroll · 2025-01-30T03:17:16Z

dot.env

+LANGCHAIN_PROJECT=
+MODEL_NAME=openai


defaulting to openai

r-carroll · 2025-01-30T03:17:49Z

example_agent/utils/ex_nodes.py


 @lru_cache(maxsize=4)
 def _get_tool_model(model_name: str):
    if model_name == "openai":
        model = ChatOpenAI(temperature=0, model_name="gpt-4o")
+    elif model_name == "ollama":
+        model = ChatOllama(temperature=0, model="llama3.1", num_ctx=4096)


increasing the context from the default (which is pretty low) provided much more reliable results

r-carroll · 2025-01-30T03:18:51Z

example_agent/utils/ex_vector_store.py

@@ -18,9 +19,34 @@


 def get_vector_store():
+    if os.environ.get("MODEL_NAME") == "ollama":


This method is pretty verbose, and I'm not the biggest fan of it as-is, so open to suggestions on what might make it better.

I don't think the embedding model should be coupled to the LLM model in use. For the vector store, you could use whatever embedding model you'd like you don't have to use OpenAIEmbedding if using OpenAI as your LLM.

To solve your edge case and make this code more simple but also more variable, I'd move the embedding model up to be a variable that the particpant can set to whatever they feel like and then add a cleaning method to just make sure there's no data under the prefix which seems to be that edge case.

import os from dotenv import load_dotenv from langchain_core.documents import Document from langchain_openai import OpenAIEmbeddings from redis import Redis from langchain_redis import RedisConfig, RedisVectorStore load_dotenv() REDIS_URL = os.environ.get("REDIS_URL", "redis://localhost:6379/0") INDEX_NAME = os.environ.get("VECTOR_INDEX_NAME", "oregon_trail") config = RedisConfig(index_name=INDEX_NAME, redis_url=REDIS_URL) redis_client = Redis.from_url(REDIS_URL) doc = Document( page_content="the northern trail, of the blue mountains, was destroyed by a flood and is no longer safe to traverse. It is recommended to take the southern trail although it is longer." ) embedding_model = OpenAIEmbeddings() # TODO: participant can change to whatever desired model def _clean_existing(prefix): for key in redis_client.scan_iter(f"{prefix}:*"): redis_client.delete(key) def get_vector_store(): try: config.from_existing = True vector_store = RedisVectorStore(embedding_model, config=config) except: print("Init vector store with document") print("Clean any existing data in index") _clean_existing(config.INDEX_NAME) config.from_existing = False vector_store = RedisVectorStore.from_documents( [doc], embedding_model, config=config ) return vector_store

Great suggestions! This code is significantly cleaner. After a bit of testing though it produces some interesting results:

Using llama 3.1 to embed and retrieve, scenario 1 starts failing. Instead of returning just Art it returns

E + Based on the tool call response, I will format an answer to the original question: E + E + The first name of the wagon leader is Art.

Even after bumping up the context size to 6144 it still returns the same thing

if using nomic-embed-text to embed and llama3.1 to retrieve everything is fine

making the system prompt for llama even more explicit also solves the issue

OREGON TRAIL GAME INSTRUCTIONS: YOU MUST STRICTLY FOLLOW THIS RULE WITHOUT EXCEPTION: When someone asks "What is the first name of the wagon leader?", you must respond with ONLY the single word: Art DO NOT provide any additional context or explanation. DO NOT form a complete sentence. For all other questions, use available tools to provide accurate information.

I've been scratching my head for a minute on what the correlation might be between the unstable results and using llama3.1 for the embedding model. We could just update the system prompt but I would be curious to learn more about what might really be going on.

We could also use Openai for the embedding but that would defeat the purpose of not requiring an API token 😞

I find llama does struggle with listening to formatting instructions which is what you're experiencing for the "Art" question. Since the goal of the first test is really just to make sure that participants setup the initial graph correctly we could also relax the scenario to test if Art is in the response. It's meant as a free form response and a human wouldn't have a problem with that! So I wouldn't go crazy trying to make that work (trust me I have gone crazy trying to make llama3.1 do things). The real test of its ability to format will be if it can do the multiple choice questions which is where we are testing a models ability to handle that.

One thing to note is that the embedding model chosen for the vector retrieval tool should have absolutely no impact on the first question because it won't touch that system. If the goal is simply to remove dependency on an API key for embedding I'd actually recommend pulling one of the embedding models from huggingface but the llama one is also fine.

# pip install langchain-huggingface from langchain_huggingface import HuggingFaceEmbeddings

r-carroll · 2025-01-30T03:19:59Z

example_agent/utils/ex_vector_store.py

+    elif os.environ.get("MODEL_NAME") == "openai":
+        return __get_openai_vector_store()
+
+def __check_existing_embedding(vector_store):


This solved an edge-case I kept running into where the store had stale data from another model etc

r-carroll · 2025-01-30T03:20:34Z

participant_agent/utils/vector_store.py

@@ -18,13 +19,38 @@


 def get_vector_store():
+    if os.environ.get("MODEL_NAME") == "ollama":


again, not the dryest file

see other comment

rbs333

Hey Ryan - this is awesome! Left a couple comments on the vector_store bit that I think should be resolved.

Thanks for the contribution :)

rbs333 · 2025-02-03T14:52:05Z

example_agent/utils/ex_vector_store.py

@@ -18,9 +19,34 @@


 def get_vector_store():
+    if os.environ.get("MODEL_NAME") == "ollama":


I don't think the embedding model should be coupled to the LLM model in use. For the vector store, you could use whatever embedding model you'd like you don't have to use OpenAIEmbedding if using OpenAI as your LLM.

To solve your edge case and make this code more simple but also more variable, I'd move the embedding model up to be a variable that the particpant can set to whatever they feel like and then add a cleaning method to just make sure there's no data under the prefix which seems to be that edge case.

import os from dotenv import load_dotenv from langchain_core.documents import Document from langchain_openai import OpenAIEmbeddings from redis import Redis from langchain_redis import RedisConfig, RedisVectorStore load_dotenv() REDIS_URL = os.environ.get("REDIS_URL", "redis://localhost:6379/0") INDEX_NAME = os.environ.get("VECTOR_INDEX_NAME", "oregon_trail") config = RedisConfig(index_name=INDEX_NAME, redis_url=REDIS_URL) redis_client = Redis.from_url(REDIS_URL) doc = Document( page_content="the northern trail, of the blue mountains, was destroyed by a flood and is no longer safe to traverse. It is recommended to take the southern trail although it is longer." ) embedding_model = OpenAIEmbeddings() # TODO: participant can change to whatever desired model def _clean_existing(prefix): for key in redis_client.scan_iter(f"{prefix}:*"): redis_client.delete(key) def get_vector_store(): try: config.from_existing = True vector_store = RedisVectorStore(embedding_model, config=config) except: print("Init vector store with document") print("Clean any existing data in index") _clean_existing(config.INDEX_NAME) config.from_existing = False vector_store = RedisVectorStore.from_documents( [doc], embedding_model, config=config ) return vector_store

rbs333 · 2025-02-03T14:53:00Z

participant_agent/utils/vector_store.py

@@ -18,13 +19,38 @@


 def get_vector_store():
+    if os.environ.get("MODEL_NAME") == "ollama":


see other comment

rbs333 · 2025-02-03T14:55:08Z

example_agent/utils/ex_nodes.py

 from langgraph.prebuilt import ToolNode

 from example_agent.utils.ex_tools import tools

 from .ex_state import AgentState, MultipleChoiceResponse

+load_dotenv()
+
+environ_model_name = os.environ.get("MODEL_NAME")


Since this is a constant I tend to follow the pattern that it should be all caps to indicate that.

good catch!

rbs333 · 2025-02-03T14:57:08Z

Readme.md

@@ -235,7 +239,13 @@ In our scenario we want to be able to retrieve the time-bound information that t

 ### Steps:
 - Open [participant_agent/utils/vector_store.py](participant_agent/utils/vector_store.py)
- Where `vector_store=None` update to `vector_store = RedisVectorStore.from_documents(<docs>, <embedding_model>, config=<config>)` with the appropriate variables.
+- Find the corresponding `get_vector_store` method either for openai or ollama 


I think this would change re other comment

…simplification

rbs333

Hey Ryan sorry this has taken me so long to get back to - been busy! I'm gonna go ahead and merge :)

Thanks for all the effort here.

init add of llama config

c26b665

r-carroll commented Jan 30, 2025

View reviewed changes

r-carroll marked this pull request as ready for review January 30, 2025 03:22

rbs333 requested changes Feb 3, 2025

View reviewed changes

r-carroll added 2 commits February 8, 2025 14:52

code review comments. simplyfying scenario 1 condition, vector store …

4bf83f6

…simplification

model name as const

c69fde7

rbs333 approved these changes Feb 28, 2025

View reviewed changes

rbs333 merged commit 487bba7 into redis-developer:main Feb 28, 2025

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Add support for local models via Ollama #6

Add support for local models via Ollama #6

r-carroll commented Jan 30, 2025 •

edited

Loading

r-carroll Jan 30, 2025

r-carroll Jan 30, 2025

r-carroll Jan 30, 2025

r-carroll Jan 30, 2025

r-carroll Jan 30, 2025

r-carroll Jan 30, 2025

r-carroll Jan 30, 2025

rbs333 Feb 3, 2025

r-carroll Feb 5, 2025

rbs333 Feb 5, 2025

r-carroll Jan 30, 2025

r-carroll Jan 30, 2025

rbs333 Feb 3, 2025

rbs333 left a comment

rbs333 Feb 3, 2025

rbs333 Feb 3, 2025

rbs333 Feb 3, 2025

r-carroll Feb 5, 2025

rbs333 Feb 3, 2025

rbs333 left a comment


		You're now ready to begin the workshop! Head back to the [Readme.md](Readme.md)

		## Restarting the workshop

		@@ -18,9 +19,34 @@


		def get_vector_store():
		if os.environ.get("MODEL_NAME") == "ollama":

		@@ -18,13 +19,38 @@


		def get_vector_store():
		if os.environ.get("MODEL_NAME") == "ollama":

		LANGCHAIN_PROJECT=
		MODEL_NAME=openai

Add support for local models via Ollama #6

Add support for local models via Ollama #6

Conversation

r-carroll commented Jan 30, 2025 • edited Loading

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

rbs333 left a comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

rbs333 left a comment

Choose a reason for hiding this comment

r-carroll commented Jan 30, 2025 •

edited

Loading